On Combining Vocal Tract Length Normalisation and Speaker Adaptation for Noise Robust Speech Recognition
Authors
Abstract
This paper investigates the combination of vocal tract length normalisation and speaker adaptation in connected digit recognition. In particular, we focus on performing this task in a continuously varying car noise environment. Continuous supervised speaker and environment adaptation is carried out on the test data according to the Bayesian framework. The paper also evaluates various approaches to implementing vocal tract length normalisation. The best performance was obtained when the normalisation was performed during both initial speaker-independent training and testing. We also observed that, during testing, speaker-specific normalisation produced better results than utterance-specific normalisation. Our experimental results on the connected digit database show that the joint approach outperforms the system in which on-line Bayesian speaker adaptation is performed on HMM mean parameters alone. The performance gain was particularly high for so-called outlier speakers, for whom adaptation is truly needed.
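The two ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the piecewise-linear warping function, the 4 kHz band edge, the 0.8 knee position, the scalar (one-dimensional) Gaussian mean, and the names `warp_frequency` and `map_update_mean` are all assumptions chosen for illustration. The first function shows a common form of vocal tract length normalisation, in which the frequency axis is scaled by a warp factor `alpha`. The second shows a standard on-line MAP (Bayesian) update of a Gaussian mean, in which a prior weight controls how quickly the mean moves towards the adaptation data.

```python
def warp_frequency(f, alpha, f_max=4000.0, knee=0.8):
    """Piecewise-linear VTLN warp of frequency f (Hz) by factor alpha.

    Assumed form: scale by alpha below the knee, then follow a straight
    line that maps f_max onto itself so the bandwidth is preserved.
    """
    f0 = knee * f_max
    if f <= f0:
        return alpha * f
    # Upper segment joins (f0, alpha * f0) to (f_max, f_max).
    slope = (f_max - alpha * f0) / (f_max - f0)
    return alpha * f0 + slope * (f - f0)


def map_update_mean(mean, weight, frames):
    """One on-line MAP step for a single Gaussian mean (scalar for brevity).

    `weight` acts as the prior count; `frames` are observations assumed
    to be assigned to this Gaussian (hard assignment is an assumption).
    Returns the updated mean and the updated prior count.
    """
    n = len(frames)
    if n == 0:
        return mean, weight
    new_weight = weight + n
    new_mean = (weight * mean + sum(frames)) / new_weight
    return new_mean, new_weight


# Usage: adapt utterance by utterance, carrying the state forward.
mean, weight = 0.0, 2.0
for utterance in ([3.0], [3.0]):
    mean, weight = map_update_mean(mean, weight, utterance)
```

In a speaker-specific setup, one `alpha` would be chosen per speaker and reused for all of that speaker's utterances; in an utterance-specific setup it would be re-estimated for every utterance. How the paper selects the warp factor is not stated in the abstract.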
Similar resources
High performance connected digit recognition through gender-dependent acoustic modelling and vocal tract length normalisation
Large inter-speaker variability of speech is one of the major factors that degrade the performance of state-of-the-art speech recognition systems. In recent years, several methods, including gender-dependent acoustic modelling and vocal tract length normalisation, have been developed to reduce this variability. In this paper, we first investigate these two methods individually and prop...
Fast estimation of warping factors in vocal tract length normalization using scores obtained from gender detection modeling
The performance of automatic speech recognition (ASR) systems is adversely affected by the variations in speakers, audio channels and environmental conditions. Making these systems robust to these variations is still a big challenge. One of the main sources of variations in the speakers is the differences between their Vocal Tract Length (VTL). Vocal Tract Length Normalization (VTLN) is an effe...
Experiments in speaker normalisation and adaptation for large vocabulary speech recognition
This paper examines techniques for speaker normalisation and adaptation that are applied in training with the aim of removing some of the variability from the speaker independent models. Two techniques are examined: vocal tract normalisation (VTN) which estimates a single "vocal tract length" parameter for each speaker and then modifies the speech parameterisation accordingly and speaker adaptiv...
Efficient Speaker and Noise Normalization for Robust Speech Recognition
In this paper, we describe a computationally efficient approach for combining speaker and noise normalization techniques. In particular, we combine the simple yet effective Histogram Equalization (HEQ) for noise compensation with Vocal-tract length normalization (VTLN) for speaker-normalization. While it is intuitive to remove noise first and then perform VTLN, this is difficult since HEQ perfo...